Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Proposition to distinguish Machine-Printed from Handwritten Arabic and Latin Words

Internal identifier: 000C50 (Main/Exploration); previous: 000C49; next: 000C51

Authors: Asma Saïdani [Tunisia]; Afef Kacem Echi [Tunisia]; Abdel Belaïd [France]

RBID : Hal:hal-01112678

Abstract

In this work, we gathered several contributions for identifying a script and its nature. We successfully employed many features to distinguish between handwritten and machine-printed Arabic and Latin scripts at the word level. Some of these features have been used previously in the literature; the others are proposed here. The newly proposed structural features are intrinsic to Arabic and Latin scripts. The performance of all extracted features is studied throughout this paper. We also compared the performance of three classifiers used to identify the script at the word level: Bayes (AODEsr), k-Nearest Neighbor (k-NN), and Decision Tree (J48). These classifiers were chosen to be sufficiently different from one another to test the contribution of the features. We carried out experiments on standard databases. The results demonstrate the ability of the features to capture differences between scripts. Using a set of 58 selected features and a Bayes-based classifier, we achieved an average identification rate of 98.72%, which is a very satisfactory rate compared with some related works.
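As a rough illustration of the word-level classification setup the abstract describes, the sketch below implements a plain k-nearest-neighbor vote over feature vectors. It is not the authors' implementation: the two-dimensional feature vectors, the labels, and the function name `knn_predict` are hypothetical, and the 58 features used in the paper are not reproduced here.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors closest to query.

    train: list of (feature_vector, label) pairs; query: a feature vector.
    """
    # Sort training samples by Euclidean distance to the query vector.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest samples.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up features per word image.
TRAIN = [
    ((0.10, 0.20), "printed"),
    ((0.15, 0.25), "printed"),
    ((0.20, 0.10), "printed"),
    ((0.80, 0.90), "handwritten"),
    ((0.85, 0.80), "handwritten"),
    ((0.90, 0.95), "handwritten"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (0.12, 0.18)))
    print(knn_predict(TRAIN, (0.88, 0.85)))
```

A real comparison along the lines of the abstract would swap in the extracted features and evaluate each classifier on held-out words; this sketch only shows the voting step.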

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Proposition to distinguish Machine-Printed from Handwritten Arabic and Latin Words</title>
<author>
<name sortKey="Saidani, Asma" sort="Saidani, Asma" uniqKey="Saidani A" first="Asma" last="Saïdani">Asma Saïdani</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-39219" status="VALID">
<orgName>Technologie de l'Information et de la Communication</orgName>
<orgName type="acronym">UTIC</orgName>
<desc>
<address>
<addrLine>5, Avenue Taha Hussein, B. P. : 56, Bab Menara, 1008 Tunis</addrLine>
<country key="TN"></country>
</address>
<ref type="url">http://www.esstt.rnu.tn/utic/</ref>
</desc>
<listRelation>
<relation active="#struct-310013" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-310013" type="direct">
<org type="institution" xml:id="struct-310013" status="INCOMING">
<orgName>École Supérieure des Sciences et Technologies de Tunis</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Tunisie</country>
</affiliation>
</author>
<author>
<name sortKey="Kacem Echi, Afef" sort="Kacem Echi, Afef" uniqKey="Kacem Echi A" first="Afef" last="Kacem Echi">Afef Kacem Echi</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-39219" status="VALID">
<orgName>Technologie de l'Information et de la Communication</orgName>
<orgName type="acronym">UTIC</orgName>
<desc>
<address>
<addrLine>5, Avenue Taha Hussein, B. P. : 56, Bab Menara, 1008 Tunis</addrLine>
<country key="TN"></country>
</address>
<ref type="url">http://www.esstt.rnu.tn/utic/</ref>
</desc>
<listRelation>
<relation active="#struct-310013" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-310013" type="direct">
<org type="institution" xml:id="struct-310013" status="INCOMING">
<orgName>École Supérieure des Sciences et Technologies de Tunis</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Tunisie</country>
</affiliation>
</author>
<author>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-206042" status="VALID">
<orgName>Recognition of writing and analysis of documents</orgName>
<orgName type="acronym">READ</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing &amp; Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-01112678</idno>
<idno type="halId">hal-01112678</idno>
<idno type="halUri">https://hal.archives-ouvertes.fr/hal-01112678</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-01112678</idno>
<date when="2014-03-06">2014-03-06</date>
<idno type="wicri:Area/Hal/Corpus">003E11</idno>
<idno type="wicri:Area/Hal/Curation">003E11</idno>
<idno type="wicri:Area/Hal/Checkpoint">000B92</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">000B92</idno>
<idno type="wicri:Area/Main/Merge">000C56</idno>
<idno type="wicri:Area/Main/Curation">000C50</idno>
<idno type="wicri:Area/Main/Exploration">000C50</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Proposition to distinguish Machine-Printed from Handwritten Arabic and Latin Words</title>
<author>
<name sortKey="Saidani, Asma" sort="Saidani, Asma" uniqKey="Saidani A" first="Asma" last="Saïdani">Asma Saïdani</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-39219" status="VALID">
<orgName>Technologie de l'Information et de la Communication</orgName>
<orgName type="acronym">UTIC</orgName>
<desc>
<address>
<addrLine>5, Avenue Taha Hussein, B. P. : 56, Bab Menara, 1008 Tunis</addrLine>
<country key="TN"></country>
</address>
<ref type="url">http://www.esstt.rnu.tn/utic/</ref>
</desc>
<listRelation>
<relation active="#struct-310013" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-310013" type="direct">
<org type="institution" xml:id="struct-310013" status="INCOMING">
<orgName>École Supérieure des Sciences et Technologies de Tunis</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Tunisie</country>
</affiliation>
</author>
<author>
<name sortKey="Kacem Echi, Afef" sort="Kacem Echi, Afef" uniqKey="Kacem Echi A" first="Afef" last="Kacem Echi">Afef Kacem Echi</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-39219" status="VALID">
<orgName>Technologie de l'Information et de la Communication</orgName>
<orgName type="acronym">UTIC</orgName>
<desc>
<address>
<addrLine>5, Avenue Taha Hussein, B. P. : 56, Bab Menara, 1008 Tunis</addrLine>
<country key="TN"></country>
</address>
<ref type="url">http://www.esstt.rnu.tn/utic/</ref>
</desc>
<listRelation>
<relation active="#struct-310013" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-310013" type="direct">
<org type="institution" xml:id="struct-310013" status="INCOMING">
<orgName>École Supérieure des Sciences et Technologies de Tunis</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Tunisie</country>
</affiliation>
</author>
<author>
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-206042" status="VALID">
<orgName>Recognition of writing and analysis of documents</orgName>
<orgName type="acronym">READ</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-423086" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-423086" type="direct">
<org type="department" xml:id="struct-423086" status="VALID">
<orgName>Department of Natural Language Processing &amp; Knowledge Discovery</orgName>
<orgName type="acronym">LORIA - NLPKD</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/Knowledge-and-Language-Management</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Arabic/Latin script</term>
<term>Classification</term>
<term>Feature extraction</term>
<term>machine-printed/handwritten word</term>
<term>Script and nature identification</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this work, we gathered several contributions for identifying a script and its nature. We successfully employed many features to distinguish between handwritten and machine-printed Arabic and Latin scripts at the word level. Some of these features have been used previously in the literature; the others are proposed here. The newly proposed structural features are intrinsic to Arabic and Latin scripts. The performance of all extracted features is studied throughout this paper. We also compared the performance of three classifiers used to identify the script at the word level: Bayes (AODEsr), k-Nearest Neighbor (k-NN), and Decision Tree (J48). These classifiers were chosen to be sufficiently different from one another to test the contribution of the features. We carried out experiments on standard databases. The results demonstrate the ability of the features to capture differences between scripts. Using a set of 58 selected features and a Bayes-based classifier, we achieved an average identification rate of 98.72%, which is a very satisfactory rate compared with some related works.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
<li>Tunisie</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Metz</li>
<li>Nancy</li>
</settlement>
<orgName>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="Tunisie">
<noRegion>
<name sortKey="Saidani, Asma" sort="Saidani, Asma" uniqKey="Saidani A" first="Asma" last="Saïdani">Asma Saïdani</name>
</noRegion>
<name sortKey="Kacem Echi, Afef" sort="Kacem Echi, Afef" uniqKey="Kacem Echi A" first="Afef" last="Kacem Echi">Afef Kacem Echi</name>
</country>
<country name="France">
<region name="Grand Est">
<name sortKey="Belaid, Abdel" sort="Belaid, Abdel" uniqKey="Belaid A" first="Abdel" last="Belaïd">Abdel Belaïd</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C50 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000C50 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Hal:hal-01112678
   |texte=   Proposition to distinguish Machine-Printed from Handwritten Arabic and Latin Words
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022